Overview

Dataset statistics

Number of variables: 13
Number of observations: 2774
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 281.9 KiB
Average record size in memory: 104.0 B
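
The overview figures above are simple aggregates over the table. A minimal sketch of how such counts can be derived, using only the Python standard library and a tiny made-up table. The rows and the helper name `overview` are illustrative assumptions, not the profiled dataset and not the pandas-profiling implementation.

```python
from typing import Any, Dict, List


def overview(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Compute report-style overview counts for a list of row dicts."""
    n_obs = len(rows)
    columns = sorted({key for row in rows for key in row})
    n_cells = n_obs * len(columns)
    # A cell counts as missing when the key is absent or its value is None.
    missing = sum(1 for row in rows for col in columns if row.get(col) is None)
    # A row counts as a duplicate when an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(col) for col in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "variables": len(columns),
        "observations": n_obs,
        "missing_cells": missing,
        "missing_cells_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
    }


# Hypothetical three-row table; the values are invented for illustration.
sample = [
    {"customer_id": 1, "gross_revenue": 10.0},
    {"customer_id": 2, "gross_revenue": None},
    {"customer_id": 1, "gross_revenue": 10.0},
]
stats = overview(sample)
```

As a side check on the reported numbers, an average record size of 104.0 B is consistent with 13 numeric columns stored as 8-byte values (13 × 8 = 104).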

Variable types

Numeric: 13

Alerts

gross_revenue is highly correlated with qtd_invoices and 3 other fields [High correlation]
qtd_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtd_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtd_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qtd_products and 1 other field [High correlation]
gross_revenue is highly correlated with qtd_invoices and 1 other field [High correlation]
qtd_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtd_items is highly correlated with gross_revenue and 1 other field [High correlation]
qtd_products is highly correlated with qtd_invoices [High correlation]
avg_ticket is highly correlated with returns and 1 other field [High correlation]
returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
gross_revenue is highly correlated with qtd_invoices and 2 other fields [High correlation]
qtd_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtd_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtd_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qtd_items [High correlation]
avg_unique_basket_size is highly correlated with qtd_products [High correlation]
df_index is highly correlated with avg_recency_days [High correlation]
gross_revenue is highly correlated with qtd_invoices and 5 other fields [High correlation]
qtd_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtd_items is highly correlated with gross_revenue and 5 other fields [High correlation]
qtd_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with df_index [High correlation]
returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with qtd_products [High correlation]
avg_ticket is highly skewed (γ1 = 51.90076423) [Skewed]
frequency is highly skewed (γ1 = 46.08539806) [Skewed]
returns is highly skewed (γ1 = 50.10197766) [Skewed]
avg_basket_size is highly skewed (γ1 = 44.86093386) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.2%) zeros [Zeros]
returns has 1481 (53.4%) zeros [Zeros]
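
The correlation and skewness alerts above come from comparing per-column statistics against thresholds. The following is a rough sketch of that idea in plain Python, not the pandas-profiling implementation. The thresholds (0.9 for correlation, 20 for skewness), the column names, and the generated data are all assumptions made for the example; the actual settings would be in the report's configuration. The skewness here is the simple moment-based coefficient, which can differ slightly from bias-adjusted estimators.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def skewness(x: Sequence[float]) -> float:
    """Moment-based skewness g1 = m3 / m2**1.5 (no bias adjustment)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def alerts(columns: Dict[str, List[float]],
           corr_threshold: float = 0.9,    # assumed threshold
           skew_threshold: float = 20.0    # assumed threshold
           ) -> List[str]:
    """Return alert strings for strongly correlated pairs and skewed columns."""
    names = list(columns)
    found = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(columns[a], columns[b])) > corr_threshold:
                found.append(f"{a} is highly correlated with {b}")
    for name in names:
        if abs(skewness(columns[name])) > skew_threshold:
            found.append(f"{name} is highly skewed")
    return found


# Invented data: 'revenue' tracks 'qty' closely, 'spike' has one extreme value.
n = 1000
data = {
    "qty": [float(v) for v in range(n)],
    "revenue": [2.0 * v + (v % 3) for v in range(n)],
    "noise": [float((v * 37) % 101) for v in range(n)],
    "spike": [0.0] * (n - 1) + [1000.0],
}
flagged = alerts(data)
```

A single extreme observation is enough to push skewness far above the threshold, which matches the pattern in the report where the skewed variables have maxima several orders of magnitude above their medians.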

Reproduction

Analysis started: 2022-08-04 11:41:06.571809
Analysis finished: 2022-08-04 11:42:17.735780
Duration: 1 minute and 11.16 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json
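
A report of this kind can be regenerated from the underlying table with pandas-profiling. The sketch below assumes the data is available as a CSV file; the function name `build_report`, the file paths, and the report title are hypothetical. It also checks that the reported duration agrees with the start and finish timestamps.

```python
from datetime import datetime


def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported lazily so that defining the function needs no third-party packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # csv_path is a hypothetical input file
    profile = ProfileReport(df, title="Customer features profile")
    profile.to_file(output_path)


# Consistency check on the timestamps listed in the Reproduction section.
started = datetime.fromisoformat("2022-08-04 11:41:06.571809")
finished = datetime.fromisoformat("2022-08-04 11:42:17.735780")
duration_seconds = (finished - started).total_seconds()
# 71.16 s corresponds to "1 minute and 11.16 seconds".
```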

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2251.237203
Minimum: 0
Maximum: 5696
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 181.65
Q1: 901.5
Median: 2061.5
Q3: 3411.25
95-th percentile: 4958.85
Maximum: 5696
Range: 5696
Interquartile range (IQR): 2509.75

Descriptive statistics

Standard deviation: 1526.597887
Coefficient of variation (CV): 0.6781150763
Kurtosis: -0.956310095
Mean: 2251.237203
Median Absolute Deviation (MAD): 1241
Skewness: 0.3794934938
Sum: 6244932
Variance: 2330501.11
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
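
Several of the df_index figures above are derived from one another, which gives a quick way to sanity-check a summary block. A small sketch using only values copied from this section (the variable names are my own):

```python
# Values copied from the df_index summary in this report.
mean = 2251.237203
std = 1526.597887
q1, q3 = 901.5, 3411.25
minimum, maximum = 0, 5696
n_obs, total = 2774, 6244932

cv = std / mean                   # coefficient of variation = std / mean
variance = std ** 2               # variance = std squared
iqr = q3 - q1                     # interquartile range = Q3 - Q1
value_range = maximum - minimum   # range = max - min
mean_from_sum = total / n_obs     # mean = sum / number of observations
```

Each derived value agrees with the corresponding line of the report up to rounding of the printed digits.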
Value | Count | Frequency (%)
0 | 1 | < 0.1%
2910 | 1 | < 0.1%
2896 | 1 | < 0.1%
2897 | 1 | < 0.1%
2900 | 1 | < 0.1%
2901 | 1 | < 0.1%
2905 | 1 | < 0.1%
2906 | 1 | < 0.1%
2907 | 1 | < 0.1%
2908 | 1 | < 0.1%
Other values (2764) | 2764 | 99.6%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
5696 | 1 | < 0.1%
5686 | 1 | < 0.1%
5680 | 1 | < 0.1%
5655 | 1 | < 0.1%
5649 | 1 | < 0.1%
5638 | 1 | < 0.1%
5637 | 1 | < 0.1%
5621 | 1 | < 0.1%
5620 | 1 | < 0.1%
5611 | 1 | < 0.1%

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2774
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15285.69971
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12626.65
Q1: 13815.25
Median: 15242.5
Q3: 16779.75
95-th percentile: 17950.35
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2964.5

Descriptive statistics

Standard deviation: 1714.984904
Coefficient of variation (CV): 0.1121953811
Kurtosis: -1.206915065
Mean: 15285.69971
Median Absolute Deviation (MAD): 1483.5
Skewness: 0.01599078757
Sum: 42402531
Variance: 2941173.222
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
17850 | 1 | < 0.1%
14482 | 1 | < 0.1%
17058 | 1 | < 0.1%
17704 | 1 | < 0.1%
16933 | 1 | < 0.1%
13772 | 1 | < 0.1%
16249 | 1 | < 0.1%
14198 | 1 | < 0.1%
13989 | 1 | < 0.1%
17930 | 1 | < 0.1%
Other values (2764) | 2764 | 99.6%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18265 | 1 | < 0.1%
18263 | 1 | < 0.1%
18261 | 1 | < 0.1%
18260 | 1 | < 0.1%

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 2760
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2904.751532
Minimum: 36.56
Maximum: 279138.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 36.56
5-th percentile: 264.557
Q1: 628.9125
Median: 1170.87
Q3: 2424.715
95-th percentile: 7579.4915
Maximum: 279138.02
Range: 279101.46
Interquartile range (IQR): 1795.8025

Descriptive statistics

Standard deviation: 10927.21927
Coefficient of variation (CV): 3.761843017
Kurtosis: 331.9508666
Mean: 2904.751532
Median Absolute Deviation (MAD): 688.765
Skewness: 16.26093044
Sum: 8057780.75
Variance: 119404120.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1025.44 | 2 | 0.1%
745.06 | 2 | 0.1%
598.2 | 2 | 0.1%
1078.96 | 2 | 0.1%
731.9 | 2 | 0.1%
1353.74 | 2 | 0.1%
2053.02 | 2 | 0.1%
379.65 | 2 | 0.1%
1314.45 | 2 | 0.1%
331 | 2 | 0.1%
Other values (2750) | 2754 | 99.3%

Value | Count | Frequency (%)
36.56 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
68.84 | 1 | < 0.1%
70.02 | 1 | < 0.1%
77.4 | 1 | < 0.1%
84.65 | 1 | < 0.1%
90.3 | 1 | < 0.1%
93.35 | 1 | < 0.1%

Value | Count | Frequency (%)
279138.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
140450.72 | 1 | < 0.1%
124564.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
72882.09 | 1 | < 0.1%
66653.56 | 1 | < 0.1%

recency_days
Real number (ℝ≥0)

ZEROS

Distinct: 252
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.62689257
Minimum: 0
Maximum: 372
Zeros: 34
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 10
Median: 29
Q3: 73
95-th percentile: 211
Maximum: 372
Range: 372
Interquartile range (IQR): 63

Descriptive statistics

Standard deviation: 68.41964137
Coefficient of variation (CV): 1.208253504
Kurtosis: 3.432018391
Mean: 56.62689257
Median Absolute Deviation (MAD): 23.5
Skewness: 1.898344739
Sum: 157083
Variance: 4681.247326
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
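
A histogram with fixed size bins splits the observed range into equal-width intervals and counts the values in each. A minimal sketch of that binning for the recency_days range (0 to 372, 50 bins); the helper name `fixed_bin_counts` and the sample values are invented for illustration, and the plotting itself is left out.

```python
from typing import List, Sequence


def fixed_bin_counts(values: Sequence[float], bins: int,
                     lo: float, hi: float) -> List[int]:
    """Count values in `bins` equal-width intervals covering [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The last bin is closed on the right so that v == hi is counted.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# recency_days spans 0 to 372 in the report; the sample values are made up.
bin_width = (372 - 0) / 50
sample = [0, 1, 1, 8, 29, 73, 211, 372]
counts = fixed_bin_counts(sample, bins=50, lo=0, hi=372)
```

With 50 bins over a range of 372 days, each bar covers 7.44 days, so the first bar alone collects every customer with a recency of 0 to 7 days.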
Value | Count | Frequency (%)
1 | 99 | 3.6%
4 | 87 | 3.1%
2 | 85 | 3.1%
3 | 85 | 3.1%
8 | 76 | 2.7%
10 | 67 | 2.4%
9 | 66 | 2.4%
7 | 65 | 2.3%
17 | 62 | 2.2%
22 | 55 | 2.0%
Other values (242) | 2027 | 73.1%

Value | Count | Frequency (%)
0 | 34 | 1.2%
1 | 99 | 3.6%
2 | 85 | 3.1%
3 | 85 | 3.1%
4 | 87 | 3.1%
5 | 43 | 1.6%
7 | 65 | 2.3%
8 | 76 | 2.7%
9 | 66 | 2.4%
10 | 67 | 2.4%

Value | Count | Frequency (%)
372 | 1 | < 0.1%
366 | 1 | < 0.1%
360 | 1 | < 0.1%
358 | 3 | 0.1%
354 | 1 | < 0.1%
337 | 1 | < 0.1%
336 | 2 | 0.1%
334 | 1 | < 0.1%
333 | 2 | 0.1%
330 | 1 | < 0.1%

qtd_invoices
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 55
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.053352559
Minimum: 2
Maximum: 206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 2
Median: 4
Q3: 6
95-th percentile: 17
Maximum: 206
Range: 204
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 9.071461768
Coefficient of variation (CV): 1.498584739
Kurtosis: 183.9551027
Mean: 6.053352559
Median Absolute Deviation (MAD): 2
Skewness: 10.62505905
Sum: 16792
Variance: 82.29141862
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 780 | 28.1%
3 | 499 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%
Other values (45) | 278 | 10.0%
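
The table of most common values for qtd_invoices above lists the ten most frequent values and folds everything else into an "Other values" row. A sketch of how such a table can be built with `collections.Counter`; the helper name `common_values` and the small sample are assumptions for illustration. The last lines re-derive the "Other values" row from the counts printed in the report.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def common_values(values: Sequence[float],
                  top: int = 10) -> List[Tuple[str, int, float]]:
    """Return (label, count, percent) rows for the `top` most frequent values."""
    counts = Counter(values)
    total = len(values)
    rows = [(str(v), c, 100.0 * c / total) for v, c in counts.most_common(top)]
    shown = sum(c for _, c, _ in rows)
    n_other = len(counts) - len(rows)   # distinct values not listed individually
    if n_other > 0:
        rest = total - shown
        rows.append((f"Other values ({n_other})", rest, 100.0 * rest / total))
    return rows


# Made-up sample: two frequent values plus two rare ones.
rows = common_values([2, 2, 2, 3, 3, 4, 5], top=2)

# Check against the report: 2774 observations, counts of the ten listed values.
top10 = [780, 499, 393, 237, 173, 138, 98, 69, 55, 54]
other_count = 2774 - sum(top10)
other_pct = round(100.0 * other_count / 2774, 1)
```

The remainder comes out at 278 observations, or 10.0% of 2774, matching the "Other values (45)" row.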

Value | Count | Frequency (%)
2 | 780 | 28.1%
3 | 499 | 18.0%
4 | 393 | 14.2%
5 | 237 | 8.5%
6 | 173 | 6.2%
7 | 138 | 5.0%
8 | 98 | 3.5%
9 | 69 | 2.5%
10 | 55 | 2.0%
11 | 54 | 1.9%

Value | Count | Frequency (%)
206 | 1 | < 0.1%
199 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
91 | 2 | 0.1%
86 | 1 | < 0.1%
72 | 1 | < 0.1%
62 | 2 | 0.1%
60 | 1 | < 0.1%
57 | 1 | < 0.1%

qtd_items
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1639
Distinct (%): 59.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1700.379957
Minimum: 2
Maximum: 196844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 119.65
Q1: 330.25
Median: 705.5
Q3: 1478.75
95-th percentile: 4645.5
Maximum: 196844
Range: 196842
Interquartile range (IQR): 1148.5

Descriptive statistics

Standard deviation: 6079.161482
Coefficient of variation (CV): 3.575178276
Kurtosis: 437.6447231
Mean: 1700.379957
Median Absolute Deviation (MAD): 453.5
Skewness: 17.32001834
Sum: 4716854
Variance: 36956204.33
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
310 | 11 | 0.4%
246 | 8 | 0.3%
150 | 8 | 0.3%
219 | 7 | 0.3%
200 | 7 | 0.3%
300 | 7 | 0.3%
493 | 7 | 0.3%
1200 | 7 | 0.3%
272 | 7 | 0.3%
260 | 7 | 0.3%
Other values (1629) | 2698 | 97.3%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
25 | 1 | < 0.1%
27 | 2 | 0.1%
30 | 1 | < 0.1%
32 | 1 | < 0.1%
33 | 2 | 0.1%

Value | Count | Frequency (%)
196844 | 1 | < 0.1%
80997 | 1 | < 0.1%
80263 | 1 | < 0.1%
77373 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%

qtd_products
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 467
Distinct (%): 16.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 129.7433309
Minimum: 2
Maximum: 7838
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 10
Q1: 34
Median: 72
Q3: 143
95-th percentile: 400.05
Maximum: 7838
Range: 7836
Interquartile range (IQR): 109

Descriptive statistics

Standard deviation: 277.7854086
Coefficient of variation (CV): 2.141038053
Kurtosis: 336.8230491
Mean: 129.7433309
Median Absolute Deviation (MAD): 45
Skewness: 15.34866005
Sum: 359908
Variance: 77164.73323
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
28 | 38 | 1.4%
35 | 34 | 1.2%
27 | 30 | 1.1%
26 | 30 | 1.1%
29 | 30 | 1.1%
15 | 27 | 1.0%
19 | 27 | 1.0%
25 | 27 | 1.0%
31 | 27 | 1.0%
33 | 26 | 0.9%
Other values (457) | 2478 | 89.3%

Value | Count | Frequency (%)
2 | 11 | 0.4%
3 | 13 | 0.5%
4 | 16 | 0.6%
5 | 16 | 0.6%
6 | 24 | 0.9%
7 | 14 | 0.5%
8 | 13 | 0.5%
9 | 19 | 0.7%
10 | 19 | 0.7%
11 | 23 | 0.8%

Value | Count | Frequency (%)
7838 | 1 | < 0.1%
5673 | 1 | < 0.1%
5095 | 1 | < 0.1%
4580 | 1 | < 0.1%
2698 | 1 | < 0.1%
2379 | 1 | < 0.1%
2060 | 1 | < 0.1%
1818 | 1 | < 0.1%
1673 | 1 | < 0.1%
1637 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 2772
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.33677308
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.852702153
Q1: 12.42379049
Median: 17.94212763
Q3: 25.07465812
95-th percentile: 88.42744262
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 12.65086763

Descriptive statistics

Standard deviation: 1071.049203
Coefficient of variation (CV): 20.46456325
Kurtosis: 2718.321218
Mean: 52.33677308
Median Absolute Deviation (MAD): 6.338589039
Skewness: 51.90076423
Sum: 145182.2085
Variance: 1147146.395
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14.47833333 | 2 | 0.1%
4.162 | 2 | 0.1%
6.269700855 | 1 | < 0.1%
32.59775 | 1 | < 0.1%
19.03048387 | 1 | < 0.1%
28.55451613 | 1 | < 0.1%
12.80068182 | 1 | < 0.1%
6.396214689 | 1 | < 0.1%
26.08797101 | 1 | < 0.1%
17.98461538 | 1 | < 0.1%
Other values (2762) | 2762 | 99.6%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.511241379 | 1 | < 0.1%
2.515333333 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.770464191 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
872.13 | 1 | < 0.1%
841.0214493 | 1 | < 0.1%
651.1683333 | 1 | < 0.1%
640 | 1 | < 0.1%
624.4 | 1 | < 0.1%
615.75 | 1 | < 0.1%

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1155
Distinct (%): 41.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.79449884
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13
Q1: 34.14930556
Median: 59
Q3: 99
95-th percentile: 224
Maximum: 366
Range: 365
Interquartile range (IQR): 64.85069444

Descriptive statistics

Standard deviation: 66.52001781
Coefficient of variation (CV): 0.844221599
Kurtosis: 3.673385052
Mean: 78.79449884
Median Absolute Deviation (MAD): 30
Skewness: 1.828126135
Sum: 218575.9398
Variance: 4424.912769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
70 | 21 | 0.8%
46 | 18 | 0.6%
55 | 17 | 0.6%
31 | 16 | 0.6%
91 | 16 | 0.6%
49 | 16 | 0.6%
21 | 15 | 0.5%
42 | 15 | 0.5%
35 | 15 | 0.5%
26 | 14 | 0.5%
Other values (1145) | 2611 | 94.1%

Value | Count | Frequency (%)
1 | 9 | 0.3%
2 | 4 | 0.1%
2.861538462 | 1 | < 0.1%
3 | 6 | 0.2%
3.330357143 | 1 | < 0.1%
3.351351351 | 1 | < 0.1%
4 | 5 | 0.2%
4.191011236 | 1 | < 0.1%
4.275862069 | 1 | < 0.1%
4.5 | 1 | < 0.1%

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
364 | 1 | < 0.1%
363 | 1 | < 0.1%
357 | 2 | 0.1%
356 | 1 | < 0.1%
355 | 2 | 0.1%
352 | 1 | < 0.1%
351 | 2 | 0.1%
350 | 3 | 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1225
Distinct (%): 44.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04969870057
Minimum: 0.005449591281
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.008746355685
Q1: 0.01575839204
Median: 0.0243902439
Q3: 0.04166666667
95-th percentile: 0.1153846154
Maximum: 17
Range: 16.99455041
Interquartile range (IQR): 0.02590827462

Descriptive statistics

Standard deviation: 0.337595074
Coefficient of variation (CV): 6.792835026
Kurtosis: 2296.516337
Mean: 0.04969870057
Median Absolute Deviation (MAD): 0.01069454458
Skewness: 46.08539806
Sum: 137.8641954
Variance: 0.113970434
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.0625 | 18 | 0.6%
0.02777777778 | 17 | 0.6%
0.02380952381 | 16 | 0.6%
0.08333333333 | 15 | 0.5%
0.09090909091 | 15 | 0.5%
0.02941176471 | 14 | 0.5%
0.03448275862 | 14 | 0.5%
0.01923076923 | 13 | 0.5%
0.02564102564 | 13 | 0.5%
0.02127659574 | 13 | 0.5%
Other values (1215) | 2626 | 94.7%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005479452055 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Value | Count | Frequency (%)
17 | 1 | < 0.1%
3 | 1 | < 0.1%
2 | 1 | < 0.1%
1.142857143 | 1 | < 0.1%
1 | 8 | 0.3%
0.75 | 1 | < 0.1%
0.6666666667 | 3 | 0.1%
0.550802139 | 1 | < 0.1%
0.5335120643 | 1 | < 0.1%
0.5 | 3 | 0.1%

returns
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 205
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.15897621
Minimum: 0
Maximum: 80995
Zeros: 1481
Zeros (%): 53.4%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 9
95-th percentile: 98
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1564.393524
Coefficient of variation (CV): 24.38308116
Kurtosis: 2586.254065
Mean: 64.15897621
Median Absolute Deviation (MAD): 0
Skewness: 50.10197766
Sum: 177977
Variance: 2447327.097
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
6 | 63 | 2.3%
5 | 55 | 2.0%
12 | 45 | 1.6%
8 | 39 | 1.4%
9 | 38 | 1.4%
Other values (195) | 653 | 23.5%

Value | Count | Frequency (%)
0 | 1481 | 53.4%
1 | 129 | 4.7%
2 | 117 | 4.2%
3 | 82 | 3.0%
4 | 72 | 2.6%
5 | 55 | 2.0%
6 | 63 | 2.3%
7 | 38 | 1.4%
8 | 39 | 1.4%
9 | 38 | 1.4%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8004 | 1 | < 0.1%
4427 | 1 | < 0.1%
3768 | 1 | < 0.1%
3332 | 1 | < 0.1%
2878 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1776 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1938
Distinct (%): 69.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 245.961992
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45
Q1: 103.3333333
Median: 172.125
Q3: 278.2375
95-th percentile: 587.875
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 174.9041667

Descriptive statistics

Standard deviation: 808.0807949
Coefficient of variation (CV): 3.285388887
Kurtosis: 2223.352169
Mean: 245.961992
Median Absolute Deviation (MAD): 81.29166667
Skewness: 44.86093386
Sum: 682298.5657
Variance: 652994.5711
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
86 | 9 | 0.3%
75 | 8 | 0.3%
60 | 8 | 0.3%
208 | 7 | 0.3%
105 | 7 | 0.3%
82 | 7 | 0.3%
73 | 7 | 0.3%
136 | 7 | 0.3%
197 | 7 | 0.3%
Other values (1928) | 2696 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%
11 | 1 | < 0.1%
11.875 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
3868.65 | 1 | < 0.1%
2880 | 1 | < 0.1%
2733.944444 | 1 | < 0.1%
2518.769231 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%
2082.225806 | 1 | < 0.1%
2000 | 1 | < 0.1%
1903.5 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 997
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.12196419
Minimum: 1
Maximum: 299.7058824
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.5
Q1: 10.12708333
Median: 17.2967033
Q3: 28
95-th percentile: 56.6469697
Maximum: 299.7058824
Range: 298.7058824
Interquartile range (IQR): 17.87291667

Descriptive statistics

Standard deviation: 18.86759007
Coefficient of variation (CV): 0.8528894591
Kurtosis: 24.17737545
Mean: 22.12196419
Median Absolute Deviation (MAD): 8.296703297
Skewness: 3.158633785
Sum: 61366.32865
Variance: 355.9859551
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 44 | 1.6%
14 | 30 | 1.1%
11 | 29 | 1.0%
1 | 26 | 0.9%
9 | 26 | 0.9%
10.5 | 25 | 0.9%
7.5 | 25 | 0.9%
9.5 | 24 | 0.9%
17.5 | 24 | 0.9%
15.5 | 23 | 0.8%
Other values (987) | 2498 | 90.1%

Value | Count | Frequency (%)
1 | 26 | 0.9%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 8 | 0.3%
1.568181818 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.666666667 | 4 | 0.1%
1.833333333 | 1 | < 0.1%
2 | 21 | 0.8%

Value | Count | Frequency (%)
299.7058824 | 1 | < 0.1%
203.5 | 1 | < 0.1%
145 | 1 | < 0.1%
136.125 | 1 | < 0.1%
135.5 | 1 | < 0.1%
122 | 1 | < 0.1%
118 | 1 | < 0.1%
114 | 1 | < 0.1%
110.3333333 | 1 | < 0.1%
110 | 1 | < 0.1%

Interactions

[Figures: interaction plots between pairs of variables (images not preserved in this text version)]
2022-08-04T08:41:25.798192image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:29.789867image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:34.353394image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:39.269896image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:44.991873image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:49.606393image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:54.884216image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:59.778877image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:05.208152image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:09.939889image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:15.761748image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:17.220331image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:22.054756image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:26.122158image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:30.118579image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:34.758822image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:39.660088image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:45.440877image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:49.989073image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:41:55.343382image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:00.136160image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:05.584618image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2022-08-04T08:42:10.294802image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
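
As a minimal sketch of this calculation (pure Python, not the code the report generator itself runs), the snippet below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `ranks`, `pearson`, and `spearman` and the toy data are illustrative assumptions.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson formula applied to the rank variables."""
    return pearson(ranks(x), ranks(y))


# Toy data (assumed): a monotonic but nonlinear relationship, y = x**3.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print("spearman:", spearman(x, y))
print("pearson: ", pearson(x, y))
```

On this toy input the ranks of x and y coincide, so ρ comes out at 1 up to floating-point rounding, while Pearson's r on the raw values is below 1.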

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the slope does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
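
The formula can be sketched in a few lines of standard-library Python. This is an illustration only, not the report generator's implementation; the function name `pearson_r` is an assumption, and the two lists are the `qtd_items` and `gross_revenue` values of the first five sample rows, so the result says nothing about the full 2774-row dataset.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Population covariance of x and y over the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


# First five sample rows of the report (illustrative subset only).
qtd_items = [1733.0, 1390.0, 5028.0, 439.0, 80.0]
gross_revenue = [5391.21, 3232.59, 6705.38, 948.25, 876.00]

r = pearson_r(qtd_items, gross_revenue)

# Shifting and positively rescaling either variable leaves r unchanged.
r_shifted = pearson_r([2 * v + 3 for v in qtd_items],
                      [0.5 * v - 1 for v in gross_revenue])

# A negative scale factor flips the sign of r but not its magnitude.
r_flipped = pearson_r([-v for v in qtd_items], gross_revenue)

print(r, r_shifted, r_flipped)
```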

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
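
The pair-counting definition can be written out directly. The sketch below is an assumption-labeled illustration (the name `kendall_tau` and the toy inputs are not from the report) and implements exactly the formula stated here, often called τ-a, in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly default to a tie-corrected variant (τ-b), so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, counted over all pairs of observations."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 is a tie in x or y: counted in neither group.
    return (concordant - discordant) / len(pairs)


# Toy data (assumed): one swapped pair out of three.
print(kendall_tau([1, 2, 3], [1, 3, 2]))
```

Counting every pair is quadratic in the number of observations, which is fine for a sketch but slow for large datasets.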

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
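
The per-column nullity that these plots summarize is just a count of missing entries per column. The sketch below shows that count in plain Python, under stated assumptions: `nullity_by_column` is a hypothetical helper, rows are dictionaries, and a missing value is represented as `None`. The third row is invented purely to show a non-zero count; the profiled dataset itself reports 0 missing cells.

```python
def nullity_by_column(rows):
    """Count missing (None) entries in each column of a list of row dictionaries."""
    columns = list(rows[0].keys())
    return {col: sum(1 for row in rows if row[col] is None) for col in columns}


rows = [
    {"customer_id": 17850, "returns": 40.0},  # from the sample rows
    {"customer_id": 13047, "returns": 35.0},  # from the sample rows
    {"customer_id": 99999, "returns": None},  # hypothetical row with a gap
]

counts = nullity_by_column(rows)
print(counts)
```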

Sample

First rows

      df_index  customer_id  gross_revenue  recency_days  qtd_invoices  qtd_items  qtd_products  avg_ticket  avg_recency_days  frequency  returns  avg_basket_size  avg_unique_basket_size
   0         0        17850        5391.21         372.0          34.0     1733.0         297.0   18.152222          1.000000  17.000000     40.0        50.970588                8.735294
   1         1        13047        3232.59          56.0           9.0     1390.0         171.0   18.904035         52.833333   0.028302     35.0       154.444444               19.000000
   2         2        12583        6705.38           2.0          15.0     5028.0         232.0   28.902500         26.500000   0.040323     50.0       335.200000               15.466667
   3         3        13748         948.25          95.0           5.0      439.0          28.0   33.866071         92.666667   0.017921      0.0        87.800000                5.600000
   4         4        15100         876.00         333.0           3.0       80.0           3.0  292.000000         20.000000   0.073171     22.0        26.666667                1.000000
   5         5        15291        4623.30          25.0          14.0     2102.0         102.0   45.326471         26.769231   0.040115     29.0       150.142857                7.285714
   6         6        14688        5630.87           7.0          21.0     3621.0         327.0   17.219786         19.263158   0.057221    399.0       172.428571               15.571429
   7         7        17809        5411.91          16.0          12.0     2057.0          61.0   88.719836         39.666667   0.033520     41.0       171.416667                5.083333
   8         8        15311       60767.90           0.0          91.0    38194.0        2379.0   25.543464          4.191011   0.243316    474.0       419.714286               26.142857
   9         9        16098        2005.63          87.0           7.0      613.0          67.0   29.934776         47.666667   0.024390      0.0        87.571429                9.571429

Last rows

      df_index  customer_id  gross_revenue  recency_days  qtd_invoices  qtd_items  qtd_products  avg_ticket  avg_recency_days  frequency  returns  avg_basket_size  avg_unique_basket_size
2764      5611        17290         525.24           3.0           2.0      404.0         102.0    5.149412              13.0   0.142857      0.0       202.000000                    51.0
2765      5620        14785          77.40          10.0           2.0       84.0           3.0   25.800000               5.0   0.333333      0.0        42.000000                     1.5
2766      5621        17254         272.44           4.0           2.0      252.0         112.0    2.432500              11.0   0.166667      0.0       126.000000                    56.0
2767      5637        17232         421.52           2.0           2.0      203.0          36.0   11.708889              12.0   0.153846      0.0       101.500000                    18.0
2768      5638        17468         137.00          10.0           2.0      116.0           5.0   27.400000               4.0   0.400000      0.0        58.000000                     2.5
2769      5649        13596         697.04           5.0           2.0      406.0         166.0    4.199036               7.0   0.250000      0.0       203.000000                    83.0
2770      5655        14893        1237.85           9.0           2.0      799.0          73.0   16.956849               2.0   0.666667      0.0       399.500000                    36.5
2771      5680        14126         706.13           7.0           3.0      508.0          15.0   47.075333               3.0   0.750000     50.0       169.333333                     5.0
2772      5686        13521        1092.39           1.0           3.0      733.0         435.0    2.511241               4.5   0.300000      0.0       244.333333                   145.0
2773      5696        15060         301.84           8.0           4.0      262.0         120.0    2.515333               1.0   2.000000      0.0        65.500000                    30.0